Improving Fault Tolerant Resource Optimized Aware Job Scheduling for Grid Computing
Authors
Abstract
Workflow brokers in existing Grid scheduling systems lack a cooperation mechanism, which leads to inefficient schedules for applications on distributed resources and worsens the utilization of resources such as network bandwidth and computational cycles. Furthermore, according to the literature, these brokering systems have primarily evolved around centralized hierarchical or client/server models. In such models, vital responsibilities such as resource discovery are delegated to centralized server machines, which brings well-known disadvantages: a single point of failure, limited scalability, and network congestion on the links leading to the server. To overcome these issues, we implement a new approach for decentralized cooperative workflow scheduling in a dynamic, distributed resource-sharing Grid environment. The actors in the system, namely users who belong to multiple control domains, workflow brokers, and resources, work together to form a single cooperative resource-sharing environment. However, this approach ignores the fact that each grid site may have its own fault-tolerance strategy, because each site is itself an autonomous domain. For instance, if a grid site uses job checkpointing, each computational node must be able to periodically transmit the transient state of the job execution to the server. When a job fails, it migrates to another computational node and resumes from the last stored checkpoint. Glowworm Swarm Optimization (GSO) is used for job scheduling to address the heterogeneity of fault tolerance in the computational grid; a Weighted GSO overcomes the position-update imperfections of general GSO more efficiently, as shown in the comparison analysis.
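The checkpoint-and-migrate behaviour described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `CheckpointServer`, `run_on_node`, and `run_with_migration`, the step-based notion of progress, and the scripted crash points are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointServer:
    """Hypothetical server that keeps the last saved progress per job."""
    saved: dict = field(default_factory=dict)

    def store(self, job_id: str, step: int) -> None:
        self.saved[job_id] = step

    def last(self, job_id: str) -> int:
        return self.saved.get(job_id, 0)


def run_on_node(job_id, total_steps, server, interval, fail_at=None):
    """Run a job from its last checkpoint; return True if it finished.

    fail_at is the step at which this node crashes (None means no crash).
    """
    step = server.last(job_id)          # resume point (0 for a fresh job)
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            return False                # node failed; unsaved work is lost
        step += 1
        if step % interval == 0:
            server.store(job_id, step)  # periodic state transfer to the server
    server.store(job_id, step)          # record completion
    return True


def run_with_migration(job_id, total_steps, server, interval, node_failures):
    """Try nodes in order, migrating to the next one after each failure.

    node_failures lists, per node, the crash step or None.
    Returns the step each node resumed from.
    """
    resumed_from = []
    for fail_at in node_failures:
        resumed_from.append(server.last(job_id))
        if run_on_node(job_id, total_steps, server, interval, fail_at):
            return resumed_from
    raise RuntimeError("all nodes failed")
```

In this sketch, a 10-step job checkpointed every 3 steps that crashes at step 7 on the first node restarts on the second node from step 6, so only the work done after the last checkpoint is repeated.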
This system supports four kinds of fault-tolerance mechanisms: job migration, job retry, checkpointing, and job replication, while also considering the risky nature of the Grid computing environment. The risk relationship between jobs and nodes is defined by the security demand and the trust level. Our simulation-based evaluation results show that our algorithm achieves a shorter makespan and is more efficient. We also analyze the efficiency of the proposed approach against a centralized coordinated workflow scheduling technique and show that our approach is more efficient than the centralized technique at achieving highly coordinated schedules.
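The abstract does not state how security demand and trust level are combined into a risk value. As a hedged sketch only, one could assume that a job is risk-free on a node whose trust level meets its security demand, and that risk otherwise grows with the gap. The exponential form, the `lam` parameter, and the function names below are illustrative assumptions, not taken from the paper.

```python
import math


def failure_risk(security_demand: float, trust_level: float, lam: float = 1.0) -> float:
    """Assumed risk model: zero when the node is trusted enough,
    otherwise growing with the gap between demand and trust."""
    gap = security_demand - trust_level
    if gap <= 0:
        return 0.0
    return 1.0 - math.exp(-lam * gap)


def safest_node(security_demand: float, trust_levels: dict) -> str:
    """Return the name of the node with the lowest assumed risk for this job."""
    return min(
        trust_levels,
        key=lambda name: failure_risk(security_demand, trust_levels[name]),
    )
```

A scheduler could use such a value to choose among the four fault-tolerance mechanisms, for example replicating a job only when every candidate node carries non-zero risk; that policy is likewise an assumption, not something the abstract specifies.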
Similar resources
Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid
Grid Computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling are important issues for large-scale computational grids because of the unreliable nature of grid resources. A commonly exploited technique to realize fault tolerance is periodic checkpointing, which periodically ...
Security Aware Parallel and Independent Job Scheduling in Grid Computing Environments Based on Adaptive Job Replication
In a grid environment, jobs may be scheduled to multiple machines across different administrative domains. However, grid security is a main hurdle to making the job scheduling decision secure, reliable and fault tolerant. A security-aware parallel and independent job scheduling algorithm in grid computing environment based on adaptive job replications was proposed. In risky and failure-prone grids,...
Optimized Assignment of Independent Task for Improving Resources Performance in Computational Grid
Grid computing has emerged from the category of distributed and parallel computing, where heterogeneous resources from different networks are used simultaneously to solve a particular problem that needs a huge amount of resources. The potential of Grid computing depends on many issues such as security of resources, heterogeneity of resources, fault tolerance & resource discovery and job scheduling. Schedu...
Efficient Resource Management Mechanism with Fault Tolerant Model for Computational Grids
Grid computing provides a framework and deployment environment that enables resource sharing, accessing, aggregation and management. It allows coordinated use of various resources in dynamic, distributed virtual organizations. Grid scheduling is responsible for resource discovery, resource selection and job assignment over a decentralized heterogeneous system. In the existing sy...
Fuzzy Logic-Based Secure and Fault Tolerant Job Scheduling in Grid
The uncertainties of grid site security are the main hurdle to making job scheduling secure, reliable and fault-tolerant. Most existing scheduling algorithms use fixed-number job replications to provide fault-tolerant ability and a high scheduling success rate, which consume excessive resources or cannot provide sufficient fault-tolerant functions when grid security conditions change. In this pap...
Journal title:
- JCS
Volume 10, Issue
Pages -
Publication date 2014